Target

In this document, we perform an exploratory data analysis of the Constitutional Referendum dataset.

Dataset

The given dataset contains the referendum results (number of electors, number of voters, vote distribution, etc.) stratified by municipality (i.e. Comune). The first six records are shown below, one column per municipality.

1 2 3 4 5 6
DESCREGIONE ABRUZZO ABRUZZO ABRUZZO ABRUZZO ABRUZZO ABRUZZO
DESCPROVINCIA CHIETI CHIETI CHIETI CHIETI CHIETI CHIETI
DESCCOMUNE ALTINO ARCHI ARI ARIELLI ATESSA BOMBA
ELETTORI 2288 1785 831 939 8454 686
ELETTORI_M 1101 861 402 453 4121 344
VOTANTI 1496 1241 617 612 5860 467
VOTANTI_M 775 632 328 304 3006 239
NUMVOTISI 533 442 241 194 1952 168
NUMVOTINO 953 782 366 410 3836 297
NUMVOTIBIANCHI 2 3 6 1 45 2
NUMVOTINONVALIDI 8 14 4 7 27 0
NUMVOTICONTESTATI 0 0 0 0 0 0

The dataset has most of the basic data needed for an initial analysis of the referendum. However, to gain more insight, and especially to be able to plot the results on a geographic map, we import an additional dataset with territorial and demographic data for the Italian municipalities (found here: http://ckan.ancitel.it/dataset/comuni-italiani-dati-territoriali-e-demografici ).

Row | 1 | 2 | 3 | 4 | 5 | 6
Comune | ABANO TERME | ABBADIA CERRETO | ABBADIA LARIANA | ABBADIA SAN SALVATORE | ABBASANTA | ABBATEGGIO
ISTAT | 28001 | 98001 | 97001 | 52001 | 95001 | 68001
Provincia | PADOVA | LODI | LECCO | SIENA | ORISTANO | PESCARA
SiglaProv | PD | LO | LC | SI | OR | PE
Regione | VENETO | LOMBARDIA | LOMBARDIA | TOSCANA | SARDEGNA | ABRUZZO
AreaGeo | Nord-Est | Nord-Ovest | Nord-Ovest | Centro | Isole | Sud
PopResidente | 19950 | 289 | 3200 | 6444 | 2747 | 400
PopStraniera | 2037 | 17 | 156 | 632 | 72 | 16
DensitaDemografica | 932.64 | 47.91 | 193.55 | 110.16 | 70.54 | 26.36
SuperficieKmq | 21.408 | 6.199 | 16.673 | 58.994 | 39.847 | 15.402
AltezzaCentro | 14 | 64 | 204 | 822 | 315 | 450
AltezzaMinima | 9 | 62 | 199 | 307 | 269 | 190
AltezzaMassima | 80 | 70 | 1700 | 1738 | 483 | 1150
ZonaAltimetrica | Montagna Interna | Pianura | Montagna Interna | Montagna Interna | Collina Interna | Collina Interna
TipoComune | No capoluogo | No capoluogo | No capoluogo | No capoluogo | No capoluogo | No capoluogo
GradoUrbaniz | Elevato | Medio | Medio | Basso | Basso | Basso
IndiceMontanita | Non montano | Non montano | Totalmente montano | Totalmente montano | Totalmente montano | Totalmente montano
ZonaClimatica | E | E | E | E | C | D
ZonaSismica | 4 | 4 | 4 | 2 | 4 | 1
ClasseComune | Polo di attrazione intercomunale | Area di cintura | Area periferica | Area periferica | Area intermedia | Area intermedia
Latitudine | 45.35944 | 45.31222 | 45.89917 | 42.88000 | 40.12500 | 42.22361
Longitudine | 11.789444 | 9.592778 | 9.333611 | 11.677500 | 8.820000 | 14.011389

The dataset used in the rest of the analysis is obtained by merging these two tables. For the join to work properly, the names of some municipalities had to be reconciled in a pre-processing step. Given the case-by-case nature of this task, the reconciliation was done by hand.
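The merge described above can be sketched in R as follows. This is only an illustration under stated assumptions: the data frame names (`referendum`, `territory`), the helper `normalize_name` and the deliberately misspelled rows in `territory` are hypothetical, and the actual reconciliation in this report was done by hand. The sketch normalises the names, joins on province and municipality, and lists the records that still fail to match and would need manual fixing.

```r
# Toy stand-ins for the two tables (the territory rows are made up for illustration).
referendum <- data.frame(
  DESCPROVINCIA = c("CHIETI", "CHIETI", "CHIETI"),
  DESCCOMUNE    = c("ALTINO", "ARCHI", "ARI"),
  ELETTORI      = c(2288, 1785, 831),
  stringsAsFactors = FALSE
)
territory <- data.frame(
  Provincia = c("Chieti", "CHIETI", "CHIETI"),
  Comune    = c(" Altino", "ARCHI", "ARI'"),  # messy spellings on purpose
  AreaGeo   = c("Sud", "Sud", "Sud"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: upper-case, collapse repeated whitespace, trim the ends.
normalize_name <- function(x) {
  x <- toupper(x)
  x <- gsub("[[:space:]]+", " ", x)
  trimws(x)
}

referendum$key_prov   <- normalize_name(referendum$DESCPROVINCIA)
referendum$key_comune <- normalize_name(referendum$DESCCOMUNE)
territory$key_prov    <- normalize_name(territory$Provincia)
territory$key_comune  <- normalize_name(territory$Comune)

# Inner join on province + municipality (names alone may not be unique).
dati <- merge(referendum, territory, by = c("key_prov", "key_comune"))

# Municipalities left without a partner: candidates for manual reconciliation.
ref_keys  <- paste(referendum$key_prov, referendum$key_comune)
ter_keys  <- paste(territory$key_prov, territory$key_comune)
unmatched <- referendum$DESCCOMUNE[!(ref_keys %in% ter_keys)]
print(unmatched)
```

Listing the unmatched names first keeps the manual step small: only the records that automatic normalisation cannot resolve need to be edited by hand.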

Plots

In this section we show some exploratory plots.

First, we look at the distribution of turnout, i.e. the percentage of voters (VOTANTI) with respect to the total number of electors (ELETTORI). As can be seen, turnout was generally high.
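As a minimal sketch of how this quantity can be computed, the snippet below uses the six sample municipalities listed in the Dataset section. The column name `PERC_VOTANTI` for the municipal turnout and the 5% bin width are assumptions made for illustration, not necessarily the choices behind the original plot.

```r
# Six sample rows taken from the referendum table shown in the Dataset section.
dati <- data.frame(
  DESCCOMUNE = c("ALTINO", "ARCHI", "ARI", "ARIELLI", "ATESSA", "BOMBA"),
  ELETTORI   = c(2288, 1785, 831, 939, 8454, 686),
  VOTANTI    = c(1496, 1241, 617, 612, 5860, 467)
)

# Turnout = voters / electors, one value per municipality.
dati$PERC_VOTANTI <- dati$VOTANTI / dati$ELETTORI

# Bin the turnout values without drawing; drop plot = FALSE to see the histogram.
h <- hist(dati$PERC_VOTANTI, breaks = seq(0, 1, by = 0.05), plot = FALSE)

print(round(dati$PERC_VOTANTI, 3))
```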

Sex distribution

As for turnout by sex, the next plot shows that males generally voted more than females. We also plot the mean of the two populations, even though both distributions are highly skewed towards the right.

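We tested the difference between the two means with a t-test, which confirms the conjecture based on the histogram: mean male turnout is about 70.8% against about 66.1% for females, and the difference is highly significant (p-value < 2.2e-16).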

## 
##  Welch Two Sample t-test
## 
## data:  (dati$VOTANTI_M/dati$ELETTORI_M) and (dati$VOTANTI_F/dati$ELETTORI_F)
## t = 32.632, df = 15677, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.04383501 0.04943773
## sample estimates:
## mean of x mean of y 
## 0.7081189 0.6614825
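The call behind this output can be sketched as follows. The female columns `ELETTORI_F` and `VOTANTI_F` do not appear in the dataset excerpt, so the sketch assumes they are derived by subtracting the male counts from the totals. Only the six sample municipalities are used here, so the numbers will differ from the output above.

```r
# Six sample rows from the referendum table (totals and male counts).
dati <- data.frame(
  ELETTORI   = c(2288, 1785, 831, 939, 8454, 686),
  ELETTORI_M = c(1101, 861, 402, 453, 4121, 344),
  VOTANTI    = c(1496, 1241, 617, 612, 5860, 467),
  VOTANTI_M  = c(775, 632, 328, 304, 3006, 239)
)

# Assumption: female counts are the difference between totals and male counts.
dati$ELETTORI_F <- dati$ELETTORI - dati$ELETTORI_M
dati$VOTANTI_F  <- dati$VOTANTI - dati$VOTANTI_M

# t.test() with two samples defaults to the Welch (unequal variances) variant.
res <- t.test(dati$VOTANTI_M / dati$ELETTORI_M,
              dati$VOTANTI_F / dati$ELETTORI_F)
print(res)
```

Since each municipality contributes one male and one female rate, a paired test (`paired = TRUE`) would be a natural alternative; the output above corresponds to the unpaired Welch version.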

Vote distribution

The following plot shows the vote distribution, stratified by geographic area. In general, the Pro votes do not exceed 50% of the total. The two main exceptions are the Centre and the North-East.

The number of blank, contested or invalid votes is generally very low (as expected). If we consider, for example, the municipalities where the percentage of invalid votes is higher than 5% (that is, in the right tail of the distribution), we can see that they are not really outliers. Instead, this “high” value is probably due to the small number of voters: with so few ballots, a handful of invalid ones is enough to exceed the threshold.

Row | 4600 | 5465 | 5572
DESCREGIONE | PIEMONTE | PIEMONTE | PIEMONTE
DESCCOMUNE | BRIGA ALTA | SEROLE | VANZONE CON SAN CARLO
ELETTORI | 36 | 99 | 354
VOTANTI | 21 | 56 | 217
NUMVOTISI | 2 | 20 | 96
NUMVOTINO | 17 | 33 | 103
NUMVOTIBIANCHI | 0 | 0 | 0
NUMVOTINONVALIDI | 2 | 3 | 18
NUMVOTICONTESTATI | 0 | 0 | 0
ClasseComune | Area ultra-periferica | Area periferica | Area intermedia
PERC_NONVALIDI | 0.09523810 | 0.05357143 | 0.08294931
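The values of PERC_NONVALIDI in the table are consistent with NUMVOTINONVALIDI divided by VOTANTI (for example 2 / 21 for Briga Alta). A minimal sketch of the filter, using the three municipalities above plus Altino from the Dataset section as a contrasting case; the name `high_invalid` is introduced here only for illustration:

```r
# Three small Piedmontese municipalities plus one larger one for contrast.
dati <- data.frame(
  DESCCOMUNE       = c("ALTINO", "BRIGA ALTA", "SEROLE", "VANZONE CON SAN CARLO"),
  VOTANTI          = c(1496, 21, 56, 217),
  NUMVOTINONVALIDI = c(8, 2, 3, 18)
)

# Share of invalid ballots among the ballots cast.
dati$PERC_NONVALIDI <- dati$NUMVOTINONVALIDI / dati$VOTANTI

# Keep only the municipalities above the 5% threshold.
high_invalid <- dati[dati$PERC_NONVALIDI > 0.05, ]
print(high_invalid)
```

With only 21 ballots cast, two invalid votes already push Briga Alta to almost 10%, which is why these rows are better read as small-sample effects than as anomalies.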

Geographic distribution

The next geographic map highlights how the Pros and the Cons are distributed. The main (restricted) areas where the Pros win are in Toscana, Emilia-Romagna and Trentino-Alto Adige.

More maps are available in the Tableau file.

Aggregated

DESCREGIONE ELETTORI VOTANTI NUMVOTISI PERC_VOTANTI PERC_SI
ABRUZZO 1052049 722930 255001 0.6871638 0.3527326
BASILICATA 467000 293546 98924 0.6285782 0.3369966
CALABRIA 1549305 842992 275449 0.5441098 0.3267516
CAMPANIA 4566905 2689070 839692 0.5888167 0.3122611
EMILIA-ROMAGNA 3326910 2526230 1262484 0.7593322 0.4997502
FRIULI-VENEZIA GIULIA 952494 690717 267357 0.7251668 0.3870717
LAZIO 4402145 3044673 1108768 0.6916340 0.3641665
LIGURIA 1241469 865756 342671 0.6973642 0.3958055
LOMBARDIA 7480375 5552510 2452936 0.7422770 0.4417707
MARCHE 1189181 866233 385768 0.7284282 0.4453398
MOLISE 256600 164038 63695 0.6392751 0.3882942
PIEMONTE 3396378 2446664 1054749 0.7203745 0.4310968
PUGLIA 3280712 2024651 659354 0.6171377 0.3256630
SARDEGNA 1375735 859158 237280 0.6245084 0.2761774
SICILIA 4013248 2271850 639629 0.5660876 0.2815454
TOSCANA 2854129 2125053 1105769 0.7445539 0.5203489
TRENTINO-ALTO ADIGE 792504 572486 305322 0.7223762 0.5333266
UMBRIA 675610 496406 240346 0.7347523 0.4841722
VALLE D’AOSTA 99735 71717 30568 0.7190756 0.4262309
VENETO 3720717 2852591 1077247 0.7666778 0.3776381
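The regional figures above are consistent with summing the municipal counts and then taking ratios (for Abruzzo, 722930 / 1052049 ≈ 0.687 and 255001 / 722930 ≈ 0.353). The sketch below reproduces that aggregation on a handful of sample municipalities from the earlier tables; the use of `aggregate` is an assumption about how the table could be built, not necessarily the original code.

```r
# Sample municipalities from the tables shown earlier in this document.
dati <- data.frame(
  DESCREGIONE = c("ABRUZZO", "ABRUZZO", "ABRUZZO",
                  "PIEMONTE", "PIEMONTE", "PIEMONTE"),
  DESCCOMUNE  = c("ALTINO", "ARCHI", "ARI",
                  "BRIGA ALTA", "SEROLE", "VANZONE CON SAN CARLO"),
  ELETTORI    = c(2288, 1785, 831, 36, 99, 354),
  VOTANTI     = c(1496, 1241, 617, 21, 56, 217),
  NUMVOTISI   = c(533, 442, 241, 2, 20, 96)
)

# Sum the counts within each region.
agg <- aggregate(cbind(ELETTORI, VOTANTI, NUMVOTISI) ~ DESCREGIONE,
                 data = dati, FUN = sum)

# Ratios are computed on the regional totals, not averaged over municipalities.
agg$PERC_VOTANTI <- agg$VOTANTI / agg$ELETTORI
agg$PERC_SI      <- agg$NUMVOTISI / agg$VOTANTI
print(agg)
```

Taking the ratio of the sums weights each municipality by its size, whereas averaging the municipal percentages would give a village of 36 electors the same weight as a large city.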

The next two plots display the national distribution of the votes. More than 13 million electors (roughly one in three) did not express an opinion.
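The order of magnitude of this figure can be checked against the regional table above. The sketch below counts only the electors who did not vote (ELETTORI minus VOTANTI); whether the original figure also includes blank and invalid ballots is not stated, so that reading is an assumption.

```r
# ELETTORI and VOTANTI per region, in the same order as the aggregated table.
elettori <- c(1052049, 467000, 1549305, 4566905, 3326910, 952494, 4402145,
              1241469, 7480375, 1189181, 256600, 3396378, 3280712, 1375735,
              4013248, 2854129, 792504, 675610, 99735, 3720717)
votanti  <- c(722930, 293546, 842992, 2689070, 2526230, 690717, 3044673,
              865756, 5552510, 866233, 164038, 2446664, 2024651, 859158,
              2271850, 2125053, 572486, 496406, 71717, 2852591)

# National abstention: electors who did not cast a ballot.
non_voters <- sum(elettori) - sum(votanti)
share      <- non_voters / sum(elettori)

print(non_voters)
print(round(share, 3))
```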